Approximate Range Queries for Clustering
Abstract
We study approximate range searching for three variants of the clustering problem on a set P of n points in d-dimensional Euclidean space with axis-parallel rectangular range queries: the k-median, k-means, and k-center range-clustering query problems. We present data structures and query algorithms that efficiently compute (1 + ε)-approximations to the optimal clusterings of P ∩ Q for a query consisting of an orthogonal range Q, an integer k, and a value ε > 0.
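To make the query model concrete, here is a minimal brute-force sketch of a k-center range-clustering query. It is not the data structure described in the abstract: it scans all of P on every query and then runs the classic farthest-first (Gonzalez) greedy heuristic, which gives a 2-approximation for k-center rather than a (1 + ε)-approximation. The function names `in_box` and `range_k_center` and the representation of Q as two corner tuples are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def in_box(p, lo, hi):
    """Return True if point p lies in the axis-parallel box [lo, hi]."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def range_k_center(points, lo, hi, k):
    """Naive baseline for a k-center range-clustering query.

    Filters P ∩ Q by a linear scan, then picks up to k centers with the
    farthest-first traversal (a 2-approximation for k-center).
    Returns (centers, radius), where radius is the largest distance from
    any point in the range to its nearest chosen center.
    """
    inside = [p for p in points if in_box(p, lo, hi)]
    if not inside or k <= 0:
        return [], 0.0

    centers = [inside[0]]
    # nearest[i] = distance from inside[i] to its closest chosen center
    nearest = [dist(p, centers[0]) for p in inside]

    while len(centers) < min(k, len(inside)):
        far = max(range(len(inside)), key=nearest.__getitem__)
        if nearest[far] == 0.0:
            break  # every remaining point coincides with a center
        centers.append(inside[far])
        nearest = [min(old, dist(p, inside[far]))
                   for old, p in zip(nearest, inside)]

    return centers, max(nearest)


if __name__ == "__main__":
    P = [(0, 0), (1, 0), (10, 0), (11, 0), (50, 50)]
    print(range_k_center(P, lo=(0, -1), hi=(12, 1), k=2))
```

Each query in this baseline costs time linear in n, plus k passes over the points inside the range. The point of a dedicated range-clustering structure is to avoid touching all of P ∩ Q on every query.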
Similar resources
Improving the View Selection Algorithm in Data Warehouses by Finding Frequent Queries
A data warehouse is a repository of historical data used to support decision making. Analytic queries usually take a long time to run. To address the response-time problem, some views should be materialized so that all queries can be answered in minimum response time. Many solutions to the view selection problem exist; the most appropriate is to materialize frequent queries. Previously posed ...
Approximate Clustering with Same-Cluster Queries
Ashtiani et al. proposed a Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to make adaptive queries to a domain expert. The queries are of the kind “do two given points belong to the same optimal cluster?”, and the answers to these queries are assumed to be consistent with a unique optimal solution. There are many clustering contexts where such same cluster quer...
Approximate Answers for XML Queries with Range Predicates
In this paper, we tackle the difficult problem of summarizing the path/branching structure and numerical value content of an XML database. We introduce a novel, powerful XML-summarization model, termed VTreeSketches, that enables accurate approximate answers for the class of twig queries with numerical-range predicates. In a nutshell, a VTreeSketch synopsis represents an effective clustering of...
Clustering over Multi-Objective Samples: The one2all Sample
Clustering is a fundamental technique in data analysis. Consider data points X that lie in a (relaxed) metric space (where the triangle inequality can be relaxed by a constant factor). Each set of points Q (centers) defines a clustering of X according to the closest center with cost V(Q) = ∑_{x∈X} d_{xQ}. This formulation generalizes classic k-means clustering, which uses squared distances. Two bas...
Range-Clustering Queries
In a geometric k-clustering problem the goal is to partition a set of points in R^d into k subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set S: given a query box Q and an integer k ≥ 2, compute an optimal k-clustering for S ∩ Q. We obtain the following results. – We present a general method to...